-
Notifications
You must be signed in to change notification settings - Fork 251
SBOM generator #1682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
SBOM generator #1682
Conversation
Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
Adding SBOM module
This update includes: - CycloneDX 1.5 schema compliance - Build hardening metadata capture (FORTIFY, PIE, RELRO, etc.) - Improved dependency tracking and PURL/CPE support - Updated documentation for CycloneDX support - New default configuration options in config.py Signed-off-by: Scott R. Shinn <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new SBOM generator plugin for mock, which is a great addition for improving software supply chain security. The implementation is comprehensive, capturing build environment details, package metadata, and dependencies in CycloneDX format. The documentation is also very detailed.
My review focuses on a few areas for improvement:
- There's a duplicated and likely outdated plugin file (
sbom_generator.py) in the root directory that should be removed. - The documentation has several inconsistencies with the implementation, particularly regarding the SBOM filename and default configuration values.
- The plugin code contains numerous
print()statements for debugging which should be replaced with proper logging. - There are some minor code style issues like imports inside functions and a hardcoded version number.
Overall, this is a solid contribution. Addressing these points will improve the quality and maintainability of the new plugin.
|
Can you add config option |
|
Please add release notes using https://github.com/rpm-software-management/mock/blob/main/docs/Release-Notes-New-Entry.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vcs-diff-lint found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Signed-off-by: Scott R. Shinn <[email protected]>
|
Generally, it LGTM. And thanks for the contribution. But please wait for the full review after the holidays. |
|
|
||
| return source_files | ||
|
|
||
| def _convert_to_chroot_path(self, host_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is strange. It does not ring bell to me. This is effectively the reverse function of buildroot.make_chroot_path(), right?
I would put it to buildroot.py and call it from_chroot_path().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference this functions job is to let it run other tools like rpmspec from the build host, which we need to expand all the macros in the rpm:
For example say the spec declares:
%define gitver f6d60fd
and you have this:
Source0: app-%{version}-%{gitver}.tar.gz
I poked around and saw 2 other plugins doing something similar:
ubreq.py
uses short_path = path[len(self.buildroot.rootdir):] <- this one is almost identical to what I am doing, so this is what I went with to minimize changes
and
rpkg_preprocessor.py
rootdir_prefix = self.buildroot.make_chroot_path()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.buildroot.make_chroot_path()
This is reverse to what you are doing. self.buildroot.make_chroot_path('FOO') is basically self.buildroot.rootdir + '/' + 'FOO'.
ad ubreq - it does not make this code a function.
The code of your function is correct. I just do not like the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, do you want me to keep it local or update buildroot.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both variants are ok for now. The latter makes a little bit more sense to me.
| @@ -0,0 +1,2025 @@ | |||
| # Copyright (C) 2025, Atomicorp, Inc. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As vcs-diff-lint warns, this file is quite big. I would love to move to separate file these functions. E.g., get_rpm_file_info(), get_rpm_file_list()...
|
|
||
| def get_rpm_file_list(self, rpm_path): | ||
| """Extracts the list of files from an RPM file.""" | ||
| cmd = ["rpm", "-qpl", rpm_path] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python-rpm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this would work reasonably well; we have to account for bootstrapped use-cases.
|
I see more things there, but I will pause here. |
|
Thank you for the contribution! My first thought is that it’s quite a large addition. :) Would you mind moving the SBOM generator into a separate Python project? Alternatively, it could be a separate Mock sub-package providing a set of scripts for specific tasks, rather than one large code bundle. I believe this would make the project much more extensible (for example, when implementing SPDX). It also gives you more control over the code, as we won't have the capacity to give it the maintenance it requires. What if there were a script, for example sbom-generator, that was: a) copied into the build chroot (not the bootstrap chroot), and b) called by Mock? It would be nice to have a standalone script to generate component metadata for a given RPM, or SRPM file, or a specfile. Also, we already have tooling to list installed packages, so it would be great to reuse that instead of adding a new implementation. |
|
@azhuzhu can you give this a brief look? How much do we overlap with the syft generator? |
Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
… error Signed-off-by: Scott R. Shinn <[email protected]>
…cleanup Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
…e.now() Signed-off-by: Scott R. Shinn <[email protected]>
|
@praiskup I definitely don't have a problem maintaining this as an external python library that hooks a lighter weight mock plugin if thats the way you all want to go. Getting everyones feedback here has been priceless either way, since unless you use mock and understand just how complete of a picture it can present is a difficult to explain to someone who hasnt used it. Or seen how easy it is My roadmap for this project is to expand it much farther into source code inter-dependency analysis, across multiple programming languages based on the presentation I saw here: https://www.darpa.mil/research/programs/enhanced-sbom-for-optimized-software-sustainment |
| return properties | ||
|
|
||
| def parse_spec_file(self, spec_path): | ||
| """Parses a spec file to extract source and patch files with their hashes and signatures.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW you can use https://github.com/packit/specfile/ This is packaged in Fedora and EPEL. And it ease the parsing of specfile a lot.
| component_obj["licenses"] = [ | ||
| { | ||
| "license": { | ||
| "id": license_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks license.id must be one valid string listing in CycloneDX v1.5 scheme. So probably you can use SPDX License Expression instead with licensfedora2spdx of license-validate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost all Fedora packages have a license in the SPDX License Expression format. And almost all RHEL 10 packages.
I suggest assuming that the license string in RPM is in SPDX format. And if not, then it is a bug of that specific package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@azhuzhu can you give this a brief look? How much do we overlap with the syft generator?
Although both are to generate an SBOM for RPM, this plugin is different from the SBOM generator we have.
- This plugin is based on the mock buildroot for one SRPM with a target arch; Our approach is based on an SRPM with multiple arches. So this doesn't have an SRPM Component as the root.
- There are a list of sources and patches here as component items, but in our generator, we only record source archives as Ancestors of the SRPM component
- This plugin also parse runtime deps, which we didn't do.
- This plugin doesn't use other external tools (like
syft) to scan the content of (S)RPMS and merge the result in the output but we do. - Some details of the logic are different
| if len(parts) < 3: | ||
| continue | ||
| package_name = parts[0].strip() | ||
| package_version = parts[1].strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this include ${ARCH}, but in other places calling _generate_purl, it's f"{VERSION}-{RELEASE}"?
Signed-off-by: Scott R. Shinn <[email protected]>
Signed-off-by: Scott R. Shinn <[email protected]>
config_opts['sbom_generator_opts'] = {
'type': 'spdx',
}
Signed-off-by: Scott R. Shinn <[email protected]>
|
SPX support option added with: general pylint cleanup also completed |
Pull Request: Add SBOM Generator Plugin
Summary
This PR adds a new plugin,
sbom_generator, which automatically generates a Software Bill of Materials (SBOM) for packages built with Mock.How it Works
The plugin runs as a
postbuildhook within the Mock lifecycle. It performs the following actions:Produced Artifacts
The plugin generates a single output file in the build results directory:
<name>-<version>-<release>.sbom: A valid CycloneDX 1.5 JSON file containing:Configuration
The following options are available in
config.pyto control the plugin behavior:generate_sbom(default:True): Enable or disable SBOM generation.include_file_components(default:True): Includes individual file entries and their hashes in the SBOM.include_file_dependencies(default:False): Maps explicit dependencies between files and their parent packages in the dependency graph.include_debug_files(default:False): Determines if debuginfo files are included in the manifests.include_man_pages(default:True): Determines if manual pages and documentation are included.include_source_dependencies(default:True): Establishes provenance by linking binary packages back to their source files (SRPMs/patches).include_toolchain_dependencies(default:False): Includes every package in the build environment as a dependency for full build-root auditing.Release Note
Added a new SBOM generator plugin. It creates a
<name>-<version>-<release>.sbomfile in the build results directory, documenting the package components, hashes, signatures, and build environment hardening status.